Games on graphs provide a natural model for reactive non-terminating systems. In such games, the
interaction of two players on an arena results in an infinite path that describes a run of the system.
Different settings are used to model various open systems in computer science, for instance turn-based
or concurrent moves, and deterministic or stochastic transitions. In this paper, we are interested
in turn-based games, and specifically in deterministic parity games and stochastic reachability games
(also known as simple stochastic games). We present a simple, direct and efficient reduction from
deterministic parity games to simple stochastic games: it yields an arena whose size is linear up to a
logarithmic factor in the size of the original arena.

Keywords. Stochastic games, parity objectives, reachability objectives. 

1 Introduction 

Graph games. Graph games are used to model reactive systems. A finite directed graph, whose vertices
represent states and edges represent transitions, models the system. Its evolution consists in interactions 
between a controller and the environment, which is naturally turned into a game on the graph between 
two players, Eve and Adam. In the turn-based setting, in each state of the system, either the controller
chooses the evolution of the system (the corresponding vertex is then controlled by Eve), or the system
evolves in an uncertain way; aiming at the worst-case scenario, the corresponding vertex is then
controlled by Adam. This defines a 2-player arena as a finite directed graph and a partition of the vertex set into Eve
and Adam vertices. However, in many applications, systems are randomized, leading to the definition of 
stochastic arenas: in addition to Eve and Adam vertices, the graph also has random vertices where the 
evolution is chosen according to a given probability distribution. 

A pebble is initially placed on the vertex representing the initial state of the system, then Eve, Adam 
and random move this pebble along the edges, constructing an infinite sequence of vertices. The sequence 
built describes a run of the system: Eve tries to ensure that it satisfies some specification of the system, 
while Adam tries to spoil it. 

Parity objectives. The theory of graph games with $\omega$-regular winning conditions is the foundation for
modelling and synthesizing reactive processes with fairness constraints. The parity objectives provide
an adequate model, as the fairness constraints of reactive processes are $\omega$-regular, and every $\omega$-regular
winning condition can be specified as a parity objective lH. We consider 2-player games with parity
objectives: deciding the winner in polynomial time is a longstanding open question, despite many efforts
from a large community. The best known upper bound is UP $\cap$ coUP [6].

Simple stochastic games. Considering probabilistic games instead of deterministic ones allows the description
of many more reactive systems by modelling uncertainty, but leads to higher complexity for the corresponding
decision problems. We consider stochastic games with reachability objectives: a given vertex
is distinguished, and Eve tries to reach it. Those games were introduced by Condon, and named simple
stochastic games [4]. We consider the following decision problem: can Eve ensure to reach the target
vertex with probability more than half? For this decision problem, the best known upper bound
is NP $\cap$ coNP [4].

Reduction: from parity games to simple stochastic games. The notion of reduction between games
is an important aspect in the study of games, as it allows one to understand which classes of games are
subsumed by others. A classical reduction of 2-player parity games to simple stochastic games is through
a sequence of three reductions: (a) from 2-player parity games to 2-player mean-payoff (or limit-average)
games [6]; (b) from 2-player mean-payoff games to 2-player discounted-payoff games [9]; and (c) from
2-player discounted-payoff games to stochastic reachability games [9]. The sequence of reductions yields
the following result: given a 2-player parity game with $n$ vertices, $m$ edges, and a parity objective with
$d$ priorities, the simple stochastic game obtained through the sequence of reductions has $n + m$ vertices,
including $m$ probabilistic ones, $4 \cdot m$ edges and the size of the arena is $O(m \cdot d \cdot \log(n))$.

Our results: we present a direct reduction of 2-player parity games to simple stochastic games, and thus
show that one can discount the step of going through mean-payoff and discounted games. Moreover, our
reduction is more efficient: given a 2-player parity game with $n$ vertices, $m$ edges, and a parity objective
with $d$ priorities, the simple stochastic game obtained by our direct reduction has $n + m$ vertices among
which $m$ are probabilistic, $3 \cdot m$ edges and the size of the arena is $O(m \cdot \log(n))$. Finally, we conclude,
following proof ideas from [5], that the decision problem for simple stochastic games is in UP $\cap$ coUP,
and from [2, 3] we obtain that the decision problems for stochastic parity, mean-payoff and discounted
games are all in UP $\cap$ coUP.

2 Definitions 

Given a finite set $A$, a probability distribution on $A$ is a function $\mu : A \to [0, 1]$ such that $\sum_{a \in A} \mu(a) = 1$.
We denote by $\mathcal{D}(A)$ the set of all probability distributions on $A$.

Stochastic arena. A stochastic (or $2\frac{1}{2}$-player) arena $G = ((V,E),(V_E,V_A,V_R),\delta)$ consists of a finite
directed graph $(V,E)$ with vertex set $V$ and edge set $E$, a partition $(V_E,V_A,V_R)$ of the vertex set $V$ and a
probabilistic transition function $\delta : V_R \to \mathcal{D}(V)$ that, given a vertex in $V_R$, gives the probability of transition
to the next vertex. Eve chooses the successor of vertices in $V_E$, while Adam chooses the successor of
vertices in $V_A$; vertices in $V_R$ are random vertices and their successor is chosen according to $\delta$. We assume
that for all $u \in V_R$ and $v \in V$ we have $(u, v) \in E$ if and only if $\delta(u)(v) > 0$. We assume that the underlying
graph has no deadlock: every vertex has a successor. The special case where $V_R = \emptyset$ corresponds to
2-player arenas (for those we omit $\delta$ from the description of the arena).

Size of an arena. The size of a stochastic arena $G = ((V,E),(V_E,V_A,V_R),\delta)$ is the number of bits required
to store it:

\[ \mathrm{size}(G) = \log(n) + 2 \cdot m \cdot \log(n) + n + n_R \cdot \log(n) + \mathrm{size}(\delta), \]

where the successive terms account for the vertices, the edges, the vertex partition and the probabilistic
transitions, $n = |V|$, $m = |E|$, $n_R = |V_R|$ and $\mathrm{size}(\delta) = \sum_{u \in V_R} \sum_{v \in V} |\delta(u)(v)|$, where $|\delta(u)(v)|$ is the length of
the binary representation of $\delta(u)(v)$.

Plays and strategies. A play $\pi$ in a stochastic arena $G$ is an infinite sequence $(v_0,v_1,v_2,\ldots)$ of vertices
such that for all $i \geq 0$ we have $(v_i,v_{i+1}) \in E$. We denote by $\Pi$ the set of all plays. A strategy for a
player is a recipe that prescribes how to play, i.e., given a finite history of play, a strategy defines the next
move. Formally, a strategy for Eve is a function $\sigma : V^* \cdot V_E \to V$ such that for all $w \in V^*$ and $v \in V_E$ we
have $(v,\sigma(w \cdot v)) \in E$. We define strategies for Adam analogously, and denote by $\Sigma$ and $\Gamma$ the sets of all
strategies for Eve and Adam, respectively. A strategy is memoryless, or positional, if it is independent
of the history of play and only depends on the current vertex, i.e., for all $w,w' \in V^*$ and $v \in V_E$ we have
$\sigma(w \cdot v) = \sigma(w' \cdot v)$. Hence a memoryless strategy can be described as a function $\sigma : V_E \to V$.

Once a starting vertex $v \in V$ and strategies $\sigma$ for Eve and $\tau$ for Adam are fixed, the outcome of the
game is a random walk $\pi(v, \sigma, \tau)$ for which the probabilities of events are uniquely defined, where an
event $\mathcal{A} \subseteq \Pi$ is a measurable set of plays. For an event $\mathcal{A} \subseteq \Pi$, we write $\mathbb{P}_v^{\sigma,\tau}(\mathcal{A})$ for the probability
that a play belongs to $\mathcal{A}$ if the game starts from the vertex $v$ and the players follow the strategies $\sigma$ and
$\tau$. In the case of 2-player arenas, if we fix positional strategies $\sigma$, $\tau$, and a starting vertex $v$, then the play
$\pi(v,\sigma,\tau)$ obtained is unique and consists of a simple path $(v_0,v_1,\ldots,v_{l-1})$ and a cycle $(v_l,v_{l+1},\ldots,v_k)$
executed infinitely often, i.e., the play is a "lasso-play": $(v_0, v_1, \ldots, v_{l-1}) \cdot (v_l,v_{l+1}, \ldots, v_k)^{\omega}$.

Qualitative objectives. We specify qualitative objectives for the players by providing a set of winning
plays $\Phi \subseteq \Pi$ for each player. We say that a play $\pi$ satisfies the objective $\Phi$ if $\pi \in \Phi$. We study only
zero-sum games, where the objectives of the two players are complementary, i.e., if Eve has the objective $\Phi$,
then Adam has the objective $\Pi \setminus \Phi$.

• Reachability objectives. Given a set $T \subseteq V$ of "target" vertices, the reachability objective requires
that some vertex of $T$ be visited. The set of winning plays is
$\mathrm{Reach}(T) = \{(v_0,v_1,v_2,\ldots) \in \Pi \mid v_k \in T \text{ for some } k \geq 0\}$.

• Parity objectives. Let $p : V \to \mathbb{N}$ be a function that assigns a priority $p(v)$ to every vertex $v \in V$.
For a play $\pi = (v_0,v_1, \ldots) \in \Pi$, we define $\mathrm{Inf}(\pi) = \{v \in V \mid v_k = v \text{ for infinitely many } k\}$ to be
the set of vertices that occur infinitely often in $\pi$. The parity objective is defined as
$\mathrm{Parity}(p) = \{\pi \in \Pi \mid \min(p(\mathrm{Inf}(\pi))) \text{ is even}\}$. In other words, the parity objective requires that the minimum
priority visited infinitely often is even.
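As an illustration of the parity objective on a "lasso-play", the following Python sketch decides which player wins a play given as a finite path followed by a cycle repeated forever. The function name `parity_winner_of_lasso` and the dictionary encoding of the priority function are assumptions made for this example, not notation from the paper.

```python
def parity_winner_of_lasso(path, cycle, priority):
    """Winner of the play path . cycle^omega for the parity objective.

    Only the vertices of the cycle occur infinitely often, so the play is
    winning for Eve exactly when the minimum priority on the cycle is even.
    `priority` is a dict mapping each vertex to a natural number.
    """
    if not cycle:
        raise ValueError("the cycle of a lasso play must be non-empty")
    lowest = min(priority[v] for v in cycle)
    return "Eve" if lowest % 2 == 0 else "Adam"


# The path visits priority 1, but only the cycle (priorities 2 and 3) matters.
print(parity_winner_of_lasso(["a"], ["b", "c"], {"a": 1, "b": 2, "c": 3}))
```

The vertices on the path play no role in the result, which is the reason the reduction later has to bound the probability of stopping the game before the cycle is reached.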

Quantitative objectives. A quantitative objective is specified as a measurable function $f : \Pi \to \mathbb{R}$. In
zero-sum games the objectives of the players are functions $f$ and $-f$, respectively. We consider two
classes of quantitative objectives, namely, mean-payoff and discounted-payoff objectives.

• Mean-payoff objectives. Let $r : V \to \mathbb{R}$ be a real-valued reward function that assigns to every vertex
$v$ the reward $r(v)$. The mean-payoff objective $\mathrm{MeanPayoff}(r)$ assigns to every play the "long-run"
average of the rewards appearing in the play. Formally, for a play $\pi = (v_0,v_1,v_2, \ldots)$ we have

\[ \mathrm{MeanPayoff}(r)(\pi) = \liminf_{n \to \infty} \frac{1}{n} \cdot \sum_{i=0}^{n-1} r(v_i). \]

• Discounted-payoff objectives. Let $r : V \to \mathbb{R}$ be a reward function and $0 < \lambda < 1$ be a discount
factor; the discounted-payoff objective $\mathrm{DiscPayoff}(\lambda,r)$ assigns to every play the discounted sum
of the rewards in the play. Formally, for a play $\pi = (v_0, v_1, v_2, \ldots)$ we have

\[ \mathrm{DiscPayoff}(\lambda,r)(\pi) = (1 - \lambda) \cdot \lim_{n \to \infty} \sum_{i=0}^{n} \lambda^i \cdot r(v_i). \]
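Values and optimal strategies. Given objectives $\Phi \subseteq \Pi$ for Eve and $\Pi \setminus \Phi$ for Adam, and measurable
functions $f$ and $-f$ for Eve and Adam, respectively, we define the value functions $\langle E \rangle$ and $\langle A \rangle$ for Eve
and Adam, respectively, as the following functions from the vertex space $V$ to the set $\mathbb{R}$ of reals: for all
vertices $v \in V$, let

\[ \langle E \rangle(\Phi)(v) = \sup_{\sigma \in \Sigma} \inf_{\tau \in \Gamma} \mathbb{P}_v^{\sigma,\tau}(\Phi); \qquad \langle E \rangle(f)(v) = \sup_{\sigma \in \Sigma} \inf_{\tau \in \Gamma} \mathbb{E}_v^{\sigma,\tau}[f]; \]

\[ \langle A \rangle(\Pi \setminus \Phi)(v) = \sup_{\tau \in \Gamma} \inf_{\sigma \in \Sigma} \mathbb{P}_v^{\sigma,\tau}(\Pi \setminus \Phi); \qquad \langle A \rangle(-f)(v) = \sup_{\tau \in \Gamma} \inf_{\sigma \in \Sigma} \mathbb{E}_v^{\sigma,\tau}[-f]. \]

In other words, the values $\langle E \rangle(\Phi)(v)$ and $\langle E \rangle(f)(v)$ give the maximal probability and expectation
with which Eve can achieve her objectives $\Phi$ and $f$ from vertex $v$, and analogously for Adam. The
strategies that achieve those values are called optimal: a strategy $\sigma$ for Eve is optimal from the vertex $v$
for the objective $\Phi$ if $\langle E \rangle(\Phi)(v) = \inf_{\tau \in \Gamma} \mathbb{P}_v^{\sigma,\tau}(\Phi)$; and $\sigma$ is optimal from the vertex $v$ for $f$ if
$\langle E \rangle(f)(v) = \inf_{\tau \in \Gamma} \mathbb{E}_v^{\sigma,\tau}[f]$. The optimal strategies for Adam are defined analogously.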

Theorem 1 (Memoryless determinacy [9]). For all stochastic arenas,

1. For all objectives $\Phi$ such that $\Phi$ is either a reachability or a parity objective, for all vertices $v$ we
have

\[ \langle E \rangle(\Phi)(v) + \langle A \rangle(\Pi \setminus \Phi)(v) = 1. \]

Memoryless optimal strategies exist for both players from all vertices. Furthermore, in the case
of 2-player arenas, for all $v \in V$, $\langle E \rangle(\Phi)(v) \in \{0, 1\}$.

2. For all objectives $f : \Pi \to \mathbb{R}$ such that $f$ is either a mean-payoff or a discounted-payoff objective,
for all vertices $v$ we have

\[ \langle E \rangle(f)(v) + \langle A \rangle(-f)(v) = 0. \]

Memoryless optimal strategies exist for both players from all vertices.

Games. A stochastic game is given by an arena and an objective. As a special case, a 2-player game is
given by a 2-player arena and an objective. For instance, a 2-player parity game is a pair $(G, \mathrm{Parity}(p))$,
where $G$ is a 2-player arena, and a stochastic reachability game is a pair $(G,\mathrm{Reach}(T))$, where $G$ is a
stochastic arena.

We define simple stochastic games to be the special case of stochastic reachability games where $V$ contains
two distinguished absorbing vertices $v_{\mathrm{win}}$ and $v_{\mathrm{lose}}$ and the reachability set is $T = \{v_{\mathrm{win}}\}$. Once
a play has reached one of the two vertices $v_{\mathrm{win}}$ or $v_{\mathrm{lose}}$, the game is stopped as its outcome is fixed. A
simple stochastic game has the stopping property if for all strategies $\sigma$ and $\tau$ and all vertices $v \in V$,

\[ \mathbb{P}_v^{\sigma,\tau}(\mathrm{Reach}(\{v_{\mathrm{win}}, v_{\mathrm{lose}}\})) = 1. \]

Decision problems for games. Given an arena $G$, an objective $\Phi$, a starting vertex $v \in V$ and a rational
threshold $q \in \mathbb{Q}$, the decision problem we consider is whether $\langle E \rangle(\Phi)(v) \geq q$. It follows from Theorem 1
that in 2-player arenas, with a parity objective $\mathrm{Parity}(p)$, for a vertex $v$ we have $\langle E \rangle(\mathrm{Parity}(p))(v) \in \{0, 1\}$.
If the value is 1, then we say that Eve is winning, otherwise Adam is winning.

3 A direct reduction 

In this section, we present a direct reduction to show that determining the winner in 2-player parity games
can be reduced to the decision problem of simple stochastic games with the threshold $\frac{1}{2}$. Specifically,
from a 2-player parity game $(G, \mathrm{Parity}(p))$ and a starting vertex $v$ we show how to construct in polynomial
time a stochastic arena $\mathcal{R}(G)$ with a reachability objective $\mathrm{Reach}(v_{\mathrm{win}})$ such that Eve is winning in
$G$ from $v$ for the parity condition if and only if $\langle E \rangle(\mathrm{Reach}(v_{\mathrm{win}}))(v) \geq \frac{1}{2}$ in $\mathcal{R}(G)$.

Figure 1: From parity games to simple stochastic games

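Construction of the stochastic arena. We now present the construction of a stochastic arena $\mathcal{R}(G)$:

\[ \mathcal{R}(G) = ((V \uplus E \uplus \{v_{\mathrm{win}}, v_{\mathrm{lose}}\}, E'), (V_E \uplus \{v_{\mathrm{win}}\}, V_A \uplus \{v_{\mathrm{lose}}\}, E), \delta), \]

where $\uplus$ denotes the disjoint union. The set of edges and the transition function are as follows:

\[ \begin{aligned}
E' = {} & \{(u,(u,v));\ ((u,v),v);\ ((u,v),v_{\mathrm{win}}) \mid (u,v) \in E,\ p(v) \text{ even}\} \\
\cup {} & \{(u,(u,v));\ ((u,v),v);\ ((u,v),v_{\mathrm{lose}}) \mid (u,v) \in E,\ p(v) \text{ odd}\} \\
\cup {} & \{(v_{\mathrm{win}}, v_{\mathrm{win}});\ (v_{\mathrm{lose}}, v_{\mathrm{lose}})\}.
\end{aligned} \]

The transition function $\delta$ is defined as follows: if $p(v)$ is even, $\delta((u, v))(v_{\mathrm{win}}) = P_v$ and $\delta((u,v))(v) = 1 - P_v$;
if $p(v)$ is odd, $\delta((u, v))(v_{\mathrm{lose}}) = P_v$ and $\delta((u,v))(v) = 1 - P_v$. We will describe the $P_v$ as reals in the
interval $(0, 1)$ satisfying certain conditions, and we will prove correctness of the reduction as long as the
conditions are satisfied.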

We present a pictorial description of the reduction in Figure 1: for each edge $(u, v) \in E$, we consider
the simple gadget that has an edge to the sink $v_{\mathrm{win}}$ if $p(v)$ is even (resp. $v_{\mathrm{lose}}$ if $p(v)$ is odd) with probability
$P_v$, and follows the original edge otherwise, with probability $1 - P_v$. Hexagonal vertices can be either Eve's
or Adam's and triangle vertices are random vertices. $v_{\mathrm{win}}$ is depicted by a smiling face and $v_{\mathrm{lose}}$ by
a sad face.

The new arena simulates the initial arena, and additionally features two absorbing vertices $v_{\mathrm{win}}$ and
$v_{\mathrm{lose}}$. To simulate a transition from $u$ to $v$, the new arena includes a random vertex that follows the
transition with high probability $1 - P_v$ or stops the game by going to $v_{\mathrm{win}}$ or $v_{\mathrm{lose}}$ with small probability
$P_v$. The intuition is that if $v$ has even priority, then Eve is rewarded for visiting it by having a small yet
positive chance of winning, and symmetrically for Adam if $v$ has odd priority.
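The construction can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `build_reduction`, the sink names `WIN` and `LOSE`, and the encoding of the arena as sets and dictionaries are hypothetical choices for this example, and the escape probabilities are passed in as a dictionary `prob` rather than fixed by the code.

```python
from fractions import Fraction

WIN, LOSE = "v_win", "v_lose"  # hypothetical names for the two absorbing sinks


def build_reduction(eve, adam, edges, priority, prob):
    """Build the stochastic arena for a 2-player parity game.

    eve, adam: sets of vertices controlled by each player;
    edges: iterable of pairs (u, v); priority: dict vertex -> int;
    prob: dict vertex -> escape probability in (0, 1).
    Every original edge (u, v) becomes a random vertex named (u, v).
    """
    eve_vertices = set(eve) | {WIN}
    adam_vertices = set(adam) | {LOSE}
    random_vertices = set(edges)
    new_edges = {(WIN, WIN), (LOSE, LOSE)}  # the sinks are absorbing
    delta = {}
    for (u, v) in edges:
        gadget = (u, v)
        # Escaping rewards Eve when the target has even priority, Adam otherwise.
        sink = WIN if priority[v] % 2 == 0 else LOSE
        new_edges.update({(u, gadget), (gadget, v), (gadget, sink)})
        delta[gadget] = {sink: prob[v], v: 1 - prob[v]}
    return eve_vertices, adam_vertices, random_vertices, new_edges, delta
```

With $m$ original edges, the sketch produces $m$ random vertices and $3 \cdot m$ gadget edges, plus the two self-loops on the sinks.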

Playing forever in $\mathcal{R}(G)$, the outcome of the play will be favorable for Eve (i.e., reach $v_{\mathrm{win}}$) if she
manages to see even priorities many times. Furthermore, the reward a player receives for visiting a
vertex with good priority must depend on this priority: seeing a very small even priority gives more
chance to win than a higher one. Indeed, if different priorities are seen infinitely often, the outcome of
the play must be in favor of the parity of the lowest priority. This leads to the following assumptions on
the probabilities $P_v$.

Figure 2: General shape of a play in a 2-player game where Eve and Adam play positionally

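Assumptions on the transition function. We consider the following assumptions on the $P_v$:

\[ \sum_{v \in V} P_v \leq \frac{1}{6} \qquad (A_0) \]

and for all $v \in V$, let $J_{\mathrm{odd}}^{>v} = \{u \mid p(u) \text{ odd}, p(u) > p(v)\}$ and $J_{\mathrm{even}}^{>v} = \{u \mid p(u) \text{ even}, p(u) > p(v)\}$:

\[ \sum_{u \in J_{\mathrm{odd}}^{>v}} P_u \leq \frac{2}{3} \cdot P_v \quad (A_1) \qquad \sum_{u \in J_{\mathrm{even}}^{>v}} P_u \leq \frac{2}{3} \cdot P_v \quad (A_2) \]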

We provide the reader with intuitions on the three assumptions $(A_0)$-$(A_2)$. The assumption $(A_0)$
ensures that probabilities are small enough such that plays in $\mathcal{R}(G)$ will last long enough to take into account
the priorities seen infinitely often, and not only the first ones. The assumptions $(A_1)$ and $(A_2)$ ensure that
if $v$ has the lowest priority and is seen infinitely often, then no matter how often higher priorities are seen,
the outcome will only depend on the priority of $v$ and not on the others.

We present a sequence of properties to prove correctness of the reduction given that the three assumptions
$(A_0)$-$(A_2)$ hold.

First note that the set $\{v_{\mathrm{win}}, v_{\mathrm{lose}}\}$ is reached with probability 1, since at each step there is a positive
probability to reach it. Another remark is that there is a one-to-one correspondence between strategies in
$G$ and $\mathcal{R}(G)$, so we identify strategies in $G$ and in $\mathcal{R}(G)$.

We will prove that for all $v \in V$, Eve wins from $v$ if and only if $\langle E \rangle(\mathrm{Reach}(v_{\mathrm{win}}))(v) \geq \frac{1}{2}$.

Thanks to Theorem 1, there are memoryless optimal strategies in both games: from now on, we
consider two memoryless strategies $\sigma$ and $\tau$. The key property is that the resulting play $\pi = (v_0,v_1, \ldots)$
has a simple shape (shown in Figure 2): the play consists of a simple path $\mathcal{P}$ from $v_0$ to $v_l$, and then a
simple cycle $\mathcal{L}$ is executed forever. Let $c$ be the lowest priority visited infinitely often,

\[ \pi = (v_0,v_1,\ldots,v_{l-1}) \cdot (v_l,v_{l+1},\ldots,v_{l+q-1})^{\omega} \]

where $p(v_{l+1}) = c$, and $\mathcal{P} = \{v_0, v_1, \ldots, v_{l-1}\}$ and $\mathcal{L} = \{v_l,v_{l+1}, \ldots, v_{l+q-1}\}$ are pairwise disjoint.

We now consider the corresponding situation in $\mathcal{R}(G)$: the random walk $\hat{\pi} = \pi(v_0,\sigma,\tau)$ mimics $\pi$
until it takes an edge to $v_{\mathrm{win}}$ or $v_{\mathrm{lose}}$, stopping the game. We denote by $\pi_i$ the random variable for the
$i$-th vertex of $\hat{\pi}$. Since the starting vertex and the strategies are fixed, we will abbreviate $\mathbb{P}_{v_0}^{\sigma,\tau}$ by $\mathbb{P}$.

We consider the different possible scenarios. There are two possibilities to reach $v_{\mathrm{win}}$ or $v_{\mathrm{lose}}$: the
first is to reach it during the first $l$ steps, i.e., during the simple path; the second is to reach it after that, i.e.,
during the simple cycle, after the simple path has been crossed.

Notations for events. We define, for $v \in \{v_{\mathrm{win}}, v_{\mathrm{lose}}\}$ and $k, j \geq 0$, the following measurable events,
where $\pi_j$ is the $j$-th vertex of the random walk $\hat{\pi}$.

• The event $\mathrm{Reach}(v, j)$ denotes that $v$ has been reached within $j$ steps, i.e., $\mathrm{Reach}(v, j) = \{\hat{\pi} \mid \pi_j = v\}$
(note that this is equivalent to $\{\hat{\pi} \mid \exists i \leq j, \pi_i = v\}$).

• The event $\mathrm{Cross}(j)$ denotes that neither $v_{\mathrm{win}}$ nor $v_{\mathrm{lose}}$ has been reached within $j$ steps, i.e.,
$\mathrm{Cross}(j) = \{\hat{\pi} \mid \pi_j \notin \{v_{\mathrm{win}}, v_{\mathrm{lose}}\}\}$.

• The event $\mathrm{ReachPath}(v)$ denotes that $v$ has been reached within $l$ steps, i.e., $\mathrm{ReachPath}(v) = \mathrm{Reach}(v, l)$.

• The event $\mathrm{CrossPath}$ denotes that neither $v_{\mathrm{win}}$ nor $v_{\mathrm{lose}}$ has been reached within $l$ steps, i.e.,
$\mathrm{CrossPath} = \mathrm{Cross}(l)$.

• The event $\mathrm{ReachLoop}(v, k)$ denotes that $v$ has been reached within $l + k \cdot q$ steps, i.e.,
$\mathrm{ReachLoop}(v,k) = \mathrm{Reach}(v, l + k \cdot q)$: intuitively, $v$ has been reached either during the path or one
of the $k$ first crossings of the loop.

• The event $\mathrm{CrossLoop}(k)$ denotes that neither $v_{\mathrm{win}}$ nor $v_{\mathrm{lose}}$ has been reached within $l + k \cdot q$ steps,
i.e., $\mathrm{CrossLoop}(k) = \mathrm{Cross}(l + k \cdot q)$: intuitively, during the path and the $k$ first crossings of the loop
neither $v_{\mathrm{win}}$ nor $v_{\mathrm{lose}}$ has been reached.

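We define the following probabilities:

• $\alpha = \mathbb{P}(\mathrm{CrossPath})$ is the probability to cross the path $\mathcal{P}$;

• $\beta = \mathbb{P}(\mathrm{ReachLoop}(v_{\mathrm{win}}, 1) \mid \mathrm{CrossPath})$ is the probability to reach $v_{\mathrm{win}}$ while following the simple
cycle $\mathcal{L}$ for the first time, assuming the path $\mathcal{P}$ was crossed, and similarly

• $\gamma = \mathbb{P}(\mathrm{ReachLoop}(v_{\mathrm{lose}}, 1) \mid \mathrm{CrossPath})$.

We take two steps: the first step is to approximate $\mathbb{P}(\mathrm{Reach}(v_{\mathrm{win}}))$ and $\mathbb{P}(\mathrm{Reach}(v_{\mathrm{lose}}))$ using $\alpha$, $\beta$ and
$\gamma$, and the second is to make use of assumptions $(A_0)$-$(A_2)$ to evaluate $\alpha$, $\beta$ and $\gamma$.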

Approximations for $\mathbb{P}(\mathrm{Reach}(v_{\mathrm{win}}))$ and $\mathbb{P}(\mathrm{Reach}(v_{\mathrm{lose}}))$. We rely on the following four properties.

Property 1. For $k \geq 1$, we have $\mathbb{P}(\mathrm{ReachLoop}(v_{\mathrm{win}},k) \mid \mathrm{CrossLoop}(k-1)) = \beta$ and similarly
$\mathbb{P}(\mathrm{ReachLoop}(v_{\mathrm{lose}},k) \mid \mathrm{CrossLoop}(k-1)) = \gamma$.

Proof. Since $\sigma$ and $\tau$ are memoryless, the random walk is "memoryless": from $v_l$, crossing the loop
for the first time or for the $k$-th time will give the same probability to escape to $v_{\mathrm{win}}$ or $v_{\mathrm{lose}}$. ■

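Property 2. We have, for all $k \geq 1$, $\mathbb{P}(\mathrm{CrossLoop}(k-1) \mid \mathrm{CrossPath}) = (1 - (\beta + \gamma))^{k-1}$.

Proof. By induction on $k \geq 1$. The case $k = 1$ follows from $\mathrm{CrossLoop}(0) = \mathrm{CrossPath}$. Let $k \geq 1$:

\[ \begin{aligned}
\mathbb{P}(\mathrm{CrossLoop}(k) \mid \mathrm{CrossPath})
&= \mathbb{P}(\mathrm{CrossLoop}(k) \mid \mathrm{CrossLoop}(k-1)) \cdot \mathbb{P}(\mathrm{CrossLoop}(k-1) \mid \mathrm{CrossPath}) \\
&= (1 - \beta - \gamma) \cdot \mathbb{P}(\mathrm{CrossLoop}(k-1) \mid \mathrm{CrossPath})
\end{aligned} \]

The first equality is a restatement and the second is a result of Property 1. We conclude thanks to the
induction hypothesis. ■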

Property 3. We have $\mathbb{P}(\mathrm{Reach}(v_{\mathrm{win}}) \mid \mathrm{CrossPath}) = \frac{\beta}{\beta + \gamma}$ and similarly
$\mathbb{P}(\mathrm{Reach}(v_{\mathrm{lose}}) \mid \mathrm{CrossPath}) = \frac{\gamma}{\beta + \gamma}$.

A simple intuition on this calculation is by referring to a "looping" game. Eve and Adam play a
game divided into possibly infinitely many rounds. Each round corresponds to crossing the loop once: while
doing so, Eve wins with probability $\beta$, Adam wins with probability $\gamma$ and the round is a draw otherwise,
with probability $1 - (\beta + \gamma)$. In case of a draw, the game goes on for another round. Once a player has won, the
game is stopped, which corresponds to reaching $v_{\mathrm{win}}$ or $v_{\mathrm{lose}}$. In this game, Eve wins with probability $\frac{\beta}{\beta + \gamma}$
and Adam with probability $\frac{\gamma}{\beta + \gamma}$.

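Proof. We have the following equalities:

\[ \begin{aligned}
\mathbb{P}(\mathrm{Reach}(v_{\mathrm{win}}) \mid \mathrm{CrossPath})
&= \sum_{k \geq 1} \mathbb{P}(\mathrm{ReachLoop}(v_{\mathrm{win}},k) \cap \mathrm{CrossLoop}(k-1) \mid \mathrm{CrossPath}) \\
&= \sum_{k \geq 1} \mathbb{P}(\mathrm{ReachLoop}(v_{\mathrm{win}},k) \mid \mathrm{CrossLoop}(k-1)) \cdot \mathbb{P}(\mathrm{CrossLoop}(k-1) \mid \mathrm{CrossPath}) \\
&= \sum_{k \geq 1} \beta \cdot (1 - (\beta + \gamma))^{k-1} \\
&= \frac{\beta}{\beta + \gamma}
\end{aligned} \]

The disjoint union $\mathrm{Reach}(v_{\mathrm{win}}) \cap \mathrm{CrossPath} = \biguplus_{k \geq 1} (\mathrm{ReachLoop}(v_{\mathrm{win}},k) \cap \mathrm{CrossLoop}(k-1))$ gives the
first equality. The second is a restatement, the third equality follows from Property 1 and Property 2, and the
last one is the sum of a geometric series. The other equality is achieved by the same proof, replacing $v_{\mathrm{win}}$ by
$v_{\mathrm{lose}}$ and using Property 1 accordingly. ■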
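The geometric series behind the "looping" game can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, and the values chosen for $\beta$ and $\gamma$ are arbitrary sample values, not probabilities taken from a specific game.

```python
from fractions import Fraction


def looping_game_partial_sum(beta, gamma, rounds):
    """Probability that Eve wins within the first `rounds` rounds.

    Round k is reached only if the k-1 previous rounds were draws, which
    happens with probability (1 - (beta + gamma)) ** (k - 1).
    """
    draw = 1 - (beta + gamma)
    return sum(beta * draw ** (k - 1) for k in range(1, rounds + 1))


def looping_game_limit(beta, gamma):
    """Closed form of the infinite sum: beta / (beta + gamma)."""
    return beta / (beta + gamma)


beta, gamma = Fraction(1, 16), Fraction(1, 32)  # arbitrary sample values
limit = looping_game_limit(beta, gamma)
partial = looping_game_partial_sum(beta, gamma, 50)
print(limit, float(limit - partial))
```

Using exact fractions makes it visible that the partial sums stay below the closed form and that the gap shrinks by the draw probability at each round.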

Property 4. We have $\mathbb{P}(\mathrm{Reach}(v_{\mathrm{win}})) \geq \alpha \cdot \frac{\beta}{\beta + \gamma}$ and similarly $\mathbb{P}(\mathrm{Reach}(v_{\mathrm{lose}})) \geq \alpha \cdot \frac{\gamma}{\beta + \gamma}$.

The intuition behind these two inequalities is that we try to ignore what happens while crossing the
path, as reaching either $v_{\mathrm{win}}$ or $v_{\mathrm{lose}}$ there is not correlated to the priorities seen infinitely often. In this context,
the multiplicative constant $\alpha$ stands for the loss due to crossing the path. As soon as the path is crossed,
what happens next will be correlated to the priorities seen infinitely often along the play. We will see that
the value of the looping game described above captures the outcome of the parity game.

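Proof. We have the following relations:

\[ \begin{aligned}
\mathbb{P}(\mathrm{Reach}(v_{\mathrm{win}}))
&= \mathbb{P}(\mathrm{ReachPath}(v_{\mathrm{win}})) + \mathbb{P}(\mathrm{Reach}(v_{\mathrm{win}}) \cap \mathrm{CrossPath}) \\
&= \mathbb{P}(\mathrm{ReachPath}(v_{\mathrm{win}})) + \mathbb{P}(\mathrm{CrossPath}) \cdot \mathbb{P}(\mathrm{Reach}(v_{\mathrm{win}}) \mid \mathrm{CrossPath}) \\
&\geq \mathbb{P}(\mathrm{CrossPath}) \cdot \mathbb{P}(\mathrm{Reach}(v_{\mathrm{win}}) \mid \mathrm{CrossPath}) \\
&= \alpha \cdot \frac{\beta}{\beta + \gamma}
\end{aligned} \]

From the disjoint union $\mathrm{Reach}(v_{\mathrm{win}}) = \mathrm{ReachPath}(v_{\mathrm{win}}) \uplus (\mathrm{Reach}(v_{\mathrm{win}}) \cap \mathrm{CrossPath})$ follows the first
equality. The second is a restatement, the inequality is straightforward, and the last equality follows
from Property 3 and the definition of $\alpha$. The other claim is achieved by
the same proof, replacing $v_{\mathrm{win}}$ by $v_{\mathrm{lose}}$ and using Property 3 accordingly. ■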

Approximations for $\alpha$, $\beta$ and $\gamma$. Note that for $i \geq 1$, we have
$\mathbb{P}(\pi_i \in \{v_{\mathrm{win}}, v_{\mathrm{lose}}\} \mid \pi_{i-1} \notin \{v_{\mathrm{win}}, v_{\mathrm{lose}}\}) = P_{v_i}$,
which follows from the construction of $\mathcal{R}(G)$: taking an escape edge comes with probability $P_{v_i}$.

Property 5. Given the assumption $(A_0)$ is satisfied, we have $\alpha \geq \frac{5}{6}$.

Intuitively, this property means that the loss due to crossing the path is bounded by a constant.
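Proof. Since the path is simple, each vertex is visited at most once. Let $I = \{i \mid v_i \in \mathcal{P}\}$ be the set of
vertices visited by this path. Then

\[ 1 - \alpha = \sum_{i \in I} \mathbb{P}(\pi_{i-1} \notin \{v_{\mathrm{win}}, v_{\mathrm{lose}}\} \cap \pi_i \in \{v_{\mathrm{win}}, v_{\mathrm{lose}}\}) \leq \sum_{i \in I} P_{v_i} \leq \frac{1}{6}. \]

The first equality follows from the disjoint union:

\[ \Pi \setminus \mathrm{CrossPath} = \mathrm{ReachPath}(v_{\mathrm{lose}}) \cup \mathrm{ReachPath}(v_{\mathrm{win}}) = \biguplus_{i \in I} (\pi_{i-1} \notin \{v_{\mathrm{win}}, v_{\mathrm{lose}}\} \cap \pi_i \in \{v_{\mathrm{win}}, v_{\mathrm{lose}}\}). \]

The last inequality follows from assumption $(A_0)$. ■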



Property 6. Given the assumptions $(A_1)$ and $(A_2)$ are satisfied, if Eve wins the play $\pi(v_0,\sigma,\tau)$ in $G$,
then we have the following inequalities:

\[ (1)\ \beta \geq P_{v_{l+1}} \qquad (2)\ \gamma \leq \frac{2}{3} \cdot P_{v_{l+1}} \]

and similarly if Adam wins the play $\pi(v_0, \sigma, \tau)$ in $G$, then we have the following inequalities:

\[ (1)\ \gamma \geq P_{v_{l+1}} \qquad (2)\ \beta \leq \frac{2}{3} \cdot P_{v_{l+1}}. \]

Intuitively, this property means that if Eve wins in the parity game, then the looping game is winning
for her with probability at least $\frac{3}{5}$, and similarly if Adam wins in the parity game, then the looping
game is winning for him with probability at least $\frac{3}{5}$.

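Proof. We prove the inequalities in both cases simultaneously.

1. It relies on the fact that the loop starts by getting to $v_{l+1}$, i.e., that either, if Eve wins:
$\{\pi_{l+1} = v_{\mathrm{win}}\} \cap \mathrm{CrossPath} \subseteq \mathrm{ReachLoop}(v_{\mathrm{win}}, 1) \cap \mathrm{CrossPath}$, or if Adam wins:
$\{\pi_{l+1} = v_{\mathrm{lose}}\} \cap \mathrm{CrossPath} \subseteq \mathrm{ReachLoop}(v_{\mathrm{lose}}, 1) \cap \mathrm{CrossPath}$.

2. Assume Eve wins, and let $J = \{i \mid v_i \in \mathcal{L} \wedge p(v_i) \text{ odd}\}$ be the set of vertices with odd priority visited
by the loop. Then

\[ \gamma = \sum_{i \in J} \mathbb{P}(\pi_{i-1} \neq v_{\mathrm{lose}} \cap \pi_i = v_{\mathrm{lose}}) \leq \sum_{i \in J} P_{v_i} \leq \sum_{u \in J_{\mathrm{odd}}^{>v_{l+1}}} P_u. \]

$(A_1)$ allows us to conclude, since $v_{l+1}$ has the lowest priority of the loop, thus $J \subseteq J_{\mathrm{odd}}^{>v_{l+1}}$. Similarly,
if Adam wins, the same proof using assumption $(A_2)$ concludes. ■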



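It follows from Properties 4, 5 and 6 that under assumptions $(A_0)$-$(A_2)$, we have the desired equivalence:
Eve wins in $G$ if and only if $\mathbb{P}(\mathrm{Reach}(v_{\mathrm{win}})) \geq \frac{1}{2}$ in $\mathcal{R}(G)$. Indeed, if Eve wins the play, then
$\mathbb{P}(\mathrm{Reach}(v_{\mathrm{win}})) \geq \alpha \cdot \frac{\beta}{\beta + \gamma} \geq \frac{5}{6} \cdot \frac{3}{5} = \frac{1}{2}$, and symmetrically for $v_{\mathrm{lose}}$ if Adam wins.

Theorem 2. Under the three assumptions $(A_0)$-$(A_2)$, we have: for all 2-player arenas $G$ equipped with
parity objective $\mathrm{Parity}(p)$, $\mathcal{R}(G)$ equipped with reachability condition $\mathrm{Reach}(v_{\mathrm{win}})$ is a simple stochastic
game with the stopping property, and for all $v \in V$, Eve wins for the parity condition from $v$ in $G$ if and
only if $\langle E \rangle(\mathrm{Reach}(v_{\mathrm{win}}))(v) \geq \frac{1}{2}$ in $\mathcal{R}(G)$.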
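Once both players have fixed positional strategies, the probability of reaching the winning sink along the resulting lasso play can be computed exactly. The Python sketch below does this with rational arithmetic. It is an illustration under assumptions: the function name `reach_win_probability` and the list and dictionary encoding are hypothetical, and the sample escape probabilities of the form $1/2^{p(v)}$ are just one choice made for the example.

```python
from fractions import Fraction


def reach_win_probability(path, cycle, priority, prob):
    """Probability of reaching the winning sink along path . cycle^omega.

    Entering a vertex v stops the game with probability prob[v], in favor
    of Eve if priority[v] is even and of Adam otherwise. The play starts
    in the first vertex, which is therefore not "entered".
    """
    def walk(entered):
        alive, win, lose = Fraction(1), Fraction(0), Fraction(0)
        for v in entered:
            if priority[v] % 2 == 0:
                win += alive * prob[v]
            else:
                lose += alive * prob[v]
            alive *= 1 - prob[v]
        return alive, win, lose

    # Crossing the path: enter v_1, ..., v_l, where v_l is the first cycle vertex.
    prefix = (list(path) + list(cycle[:1]))[1:]
    alpha, win_on_path, _ = walk(prefix)
    # One round of the loop: enter v_{l+1}, ..., v_{l+q-1}, then v_l again.
    _, beta, gamma = walk(list(cycle[1:]) + list(cycle[:1]))
    return win_on_path + alpha * beta / (beta + gamma)


priority = {"a": 4, "b": 5}
prob = {v: Fraction(1, 2 ** k) for v, k in priority.items()}
print(reach_win_probability([], ["a", "b"], priority, prob))  # 31/47
```

In this example the lowest priority on the cycle is even and the computed probability exceeds one half, in line with the equivalence stated above.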
Transition probabilities. We now present transition probabilities satisfying the assumptions $(A_0)$-$(A_2)$
that can be described with $O(\log(n))$ bits. Let $p : V \to \mathbb{N}$ be the priority function in $G$; we first build an
equivalent parity function $p'$. We sort vertices with respect to $p$ and define the following monotone
mapping:

• the lowest priority becomes either 4 if it is even or 5 if odd;

• proceeding from the lowest to the greatest, a vertex is assigned the lowest integer greater than the
last integer used, matching its parity.

This ensures that all priorities are distinct, and the highest priority is at most $2n + 3$. Then, apply the
reduction $\mathcal{R}$ with $P_v = \frac{1}{2^{p'(v)}}$. We argue that the probability transition function satisfies $(A_0)$-$(A_2)$. We
have

\[ \sum_{v \in V} P_v \leq \sum_{i \geq 4} \frac{1}{2^i} = \frac{1}{8} \leq \frac{1}{6}. \]

Hence $(A_0)$ is satisfied. For all $v \in V$,

\[ \sum_{u \in J_{\mathrm{odd}}^{>v}} P_u \leq \sum_{i \geq 0} \frac{1}{2^{p'(v) + 1 + 2i}} = \frac{2}{3} \cdot P_v. \]

Hence $(A_1)$ is satisfied and a similar argument holds for $(A_2)$. Hence we have the following result.
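The priority remapping and the three assumptions can be checked mechanically on small instances. The following Python sketch is a minimal illustration, assuming escape probabilities of the form $P_v = 1/2^{p'(v)}$; the function names and the dictionary encoding are hypothetical.

```python
from fractions import Fraction


def compress_priorities(priority):
    """Monotone remapping to distinct priorities, starting at 4 or 5.

    Vertices are processed by increasing priority; each one receives the
    lowest integer above the last one used that matches its parity.
    """
    new_priority, last = {}, None
    for v in sorted(priority, key=lambda x: priority[x]):
        parity = priority[v] % 2
        if last is None:
            candidate = 4 if parity == 0 else 5
        else:
            candidate = last + 1
            if candidate % 2 != parity:
                candidate += 1
        new_priority[v] = candidate
        last = candidate
    return new_priority


def check_assumptions(priority, prob):
    """Return the truth values of (A0), (A1) and (A2) for the given data."""
    a0 = sum(prob.values()) <= Fraction(1, 6)
    a1 = a2 = True
    for v in priority:
        higher = [u for u in priority if priority[u] > priority[v]]
        odd_sum = sum(prob[u] for u in higher if priority[u] % 2 == 1)
        even_sum = sum(prob[u] for u in higher if priority[u] % 2 == 0)
        a1 = a1 and odd_sum <= Fraction(2, 3) * prob[v]
        a2 = a2 and even_sum <= Fraction(2, 3) * prob[v]
    return a0, a1, a2


new_p = compress_priorities({"a": 0, "b": 1, "c": 1, "d": 2})
probs = {v: Fraction(1, 2 ** k) for v, k in new_p.items()}
print(new_p, check_assumptions(new_p, probs))
```

On this sample the remapping yields the priorities 4, 5, 7 and 8, and all three assumptions hold.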

Theorem 3. $\mathcal{R}$ is a polynomial-time reduction from 2-player parity games to simple stochastic games.
Furthermore, for all 2-player parity games $(G, \mathrm{Parity}(p))$, the size of the stochastic arena $\mathcal{R}(G)$ is
$O(|E| \cdot \log(|V|))$.

In this section, we discuss related works. Deciding the winner in parity games is equivalent to the
model-checking problem of modal mu-calculus. A reduction from model-checking games to simple stochastic
games was defined in ITj. Another reduction, using a discounted mu-calculus, from concurrent parity
games to concurrent discounted games was presented in fTl. Our intent was to propose a direct and
simple reduction from 2-player parity games to simple stochastic games. In the first subsection, we
discuss its efficiency compared to the previously known three step reduction. In the second subsection,
we use remarks from [5] to prove that solving stochastic parity, mean-payoff, discounted-payoff games
as well as simple stochastic games is in UP $\cap$ coUP.

4.1 Discounting the discounted

In this subsection we present the classical sequence of reductions: from 2-player parity games to 2-player
mean-payoff games [6], from 2-player mean-payoff games to 2-player discounted-payoff games [9], and
from 2-player discounted-payoff games to simple stochastic games [9].

Parity games to mean-payoff games. A 2-player parity game with $n$ vertices and $d$ different priorities
can be reduced in polynomial time to a 2-player mean-payoff game on the same arena using rewards
from the set $\{-n^d, \ldots, n^d\}$, such that Eve wins the parity game if and only if the value of Eve in the
mean-payoff game is at least 0 [6].

Mean-payoff games to discounted-payoff games. A 2-player mean-payoff game with $n$ vertices whose
reward function ranges from $-B$ to $B$ can be reduced in polynomial time to a discounted-payoff game on
the same arena with discount factor $\lambda$ such that $\lambda \geq 1 - \frac{1}{4 n^3 B}$, such that the value of Eve in the mean-payoff
game is at least 0 if and only if the value of Eve in the discounted-payoff game is at least 0 [9].

Discounted-payoff games to simple stochastic games. A 2-player discounted-payoff game with $n$
vertices and $m$ edges can be reduced in polynomial time to a simple stochastic game using $n + m$ vertices including $m$
random vertices and $4 \cdot m$ edges such that the value of Eve in the discounted-payoff game is at least 0 if
and only if the value of Eve in the simple stochastic game is at least $\frac{1}{2}$ [9].

Size of the resulting games. We now analyze the size of the games produced by the three step reduction.
Let $G$ be a 2-player parity game having $n$ vertices, $m$ edges and $d$ distinct priorities. The first reduction to
a 2-player mean-payoff game yields a game with $n$ vertices, $m$ edges and rewards that can be specified
with $O(d \cdot \log(n))$ bits. The second reduction to a discounted-payoff game yields a 2-player game with
$n$ vertices, $m$ edges, rewards specified with $O(d \cdot \log(n))$ bits and the discount factor specified with
$O(d \cdot \log(n))$ bits. Finally, the last reduction to a simple stochastic game yields a game with $n + m$
vertices, with $m$ random vertices, $4 \cdot m$ edges and each probability of transition specified with $O(d \cdot \log(n))$
bits, thus the size of the transition function is $O(m \cdot (\log(n + m) + d \cdot \log(n))) = O(m \cdot d \cdot \log(n))$. Since
$d$ is $O(n)$, in the worst case the size of the game obtained by the three step reduction is $O(m \cdot n \cdot \log(n))$.

4.2 The complexity of stochastic games 

Another motivation to present a clean and direct reduction from 2-player parity games to simple stochastic
games was to extend it from stochastic parity games to simple stochastic games. As for the deterministic
case, such a reduction is known, but again through stochastic mean-payoff and stochastic
discounted-payoff games, and is more involved [3]. Although we did not manage to adapt our proofs to extend
our direct reduction from stochastic parity games to simple stochastic games, we believe it is possible.
Our main difficulty is that the shape of a play, even if both players play positionally, is no longer a
"lasso-play". Indeed, even if the parity condition is satisfied with probability more than half, we cannot
guarantee that an even priority will be visited within a linear number of steps.

In the remainder of this subsection, we gather several results and make two very simple observations
on the result of Condon [5] to prove that the decision problem for simple stochastic games is in UP $\cap$
coUP, in a similar fashion to the proof of [6], which was stated for the simpler case of 2-player discounted
games.

The reduction of stochastic parity to stochastic mean-payoff games was established in [3] and the reduction
of stochastic mean-payoff and discounted games to simple stochastic games was established in [2].
Figure 3 summarizes all the reductions. We now argue that simple stochastic games can be decided
in UP $\cap$ coUP.

Simple stochastic games in UP $\cap$ coUP. First, it was shown in [4] that simple stochastic games with
arbitrary rational transition probabilities can be reduced in polynomial time to stopping simple stochastic
games where random vertices have two outgoing edges, each with probability one half. Second, it follows
from the result of [5] that the value vector of a stopping simple stochastic game is the unique solution of
the following set of equations, where $\langle E \rangle(v)$ abbreviates $\langle E \rangle(\mathrm{Reach}(v_{\mathrm{win}}))(v)$:

\[ \begin{cases}
\langle E \rangle(v_{\mathrm{win}}) = 1 \\
\langle E \rangle(v_{\mathrm{lose}}) = 0 \\
\langle E \rangle(v) = \max_{(v,v') \in E} \langle E \rangle(v') & \text{if } v \in V_E \\
\langle E \rangle(v) = \min_{(v,v') \in E} \langle E \rangle(v') & \text{if } v \in V_A \\
\langle E \rangle(v) = \sum_{v' \in V} \delta(v)(v') \cdot \langle E \rangle(v') & \text{if } v \in V_R
\end{cases} \]

Figure 3: Reductions



Hence an algorithm can guess the value vector and check that it is actually the solution of the set of
equations. To prove the desired result we need to show that the guess is of polynomial size and the verification
can be achieved in polynomial time. It follows from [4] that for simple stochastic games with $n$ vertices
and all probabilities one half, the values are of the form $p/q$, where $p,q$ are integers with $0 \leq p,q \leq 4^{n-1}$.
Hence the length of the guess is at most $n \cdot \log(4^{n-1}) = O(n^2)$, which is polynomial. Thus the guess
is of polynomial size and the verification can be done in polynomial time. The uniqueness of the solution implies
that simple stochastic games are in UP, and the coUP argument is symmetric. Along with the reductions
of [2, 3] we obtain the following result.
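The verification step of this guess-and-check argument can be sketched in Python. This is only an illustration under assumptions: the function name `satisfies_value_equations` and the encoding of the game are hypothetical, the sets `eve` and `adam` are taken to exclude the two sinks, and the check characterizes the value vector only for stopping games.

```python
from fractions import Fraction


def satisfies_value_equations(values, eve, adam, successors, delta, win, lose):
    """Check a guessed value vector against the fixed-point equations.

    values: dict vertex -> Fraction; eve, adam: non-sink vertices of each
    player; successors: dict vertex -> list of successors; delta: dict
    random vertex -> dict successor -> probability.
    """
    if values[win] != 1 or values[lose] != 0:
        return False
    for v in eve:
        if values[v] != max(values[w] for w in successors[v]):
            return False
    for v in adam:
        if values[v] != min(values[w] for w in successors[v]):
            return False
    for v, dist in delta.items():
        if values[v] != sum(p * values[w] for w, p in dist.items()):
            return False
    return True


half = Fraction(1, 2)
delta = {"r1": {"win": half, "lose": half}, "r2": {"win": half, "r1": half}}
guess = {"win": Fraction(1), "lose": Fraction(0), "r1": half,
         "r2": Fraction(3, 4), "e": Fraction(3, 4)}
print(satisfies_value_equations(guess, {"e"}, set(), {"e": ["r1", "r2"]},
                                delta, "win", "lose"))
```

Each equation is checked once with exact rational arithmetic, so the running time is polynomial in the size of the game and of the guessed vector.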

Theorem 4 (Complexity of stochastic games). For all stochastic arenas, for all objectives $\phi$ such that
$\phi$ is a parity, mean-payoff, discounted-payoff or reachability objective, the decision problem of whether
$\langle E \rangle(\phi)(v) \geq q$, for a rational number $q$, is in UP $\cap$ coUP.